Facilitated diffusion on mobile DNA: configurational traps and sequence heterogeneity
We present Brownian dynamics simulations of the facilitated diffusion of a protein, modelled as
a sphere with a binding site on its surface, along DNA, modelled as a semi-flexible polymer. We 
consider both the effect of DNA organisation in 3D, and of sequence heterogeneity. We find that 
in a network of DNA loops, as are thought to be present in bacterial DNA, the search process 
is very sensitive to the spatial location of the target within such loops. Therefore, specific genes 
might be repressed or promoted by changing the local topology of the genome. On the other hand, 
sequence heterogeneity creates traps which normally slow down facilitated diffusion. When suitably 
positioned, though, these traps can, surprisingly, render the search process much more efficient. 

PACS numbers: 



In living cells, proteins routinely need to reach a tar- 
get positioned on the DNA, e.g. to initiate transcription 
of one gene, or to silence or suppress another. Impor- 
tantly, the search for the target has to be both rapid 
and efficient. Most experimental results suggest that, 
within bacterial cells, this process takes about two orders 
of magnitude less time than one would estimate by as- 
suming unbiased 3D protein diffusion [ll-Q- How is such 
an efficient search realised in practice? The commonly 
accepted theory is that when seeking their target, pro- 
teins alternate between phases of free diffusion through 
the cytoplasm, and phases in which they slide along the 
DNA, effectively performing 1D diffusion along its backbone
[4-6]. This combined strategy is known as "facilitated
diffusion" [Ml-

A simple scaling argument to predict the magnitude of the
mean search time, $\tau_s$, that a protein needs to find a
target on the DNA, is as follows. The key parameters are the
DNA length $L$, the volume of the cell $V$, the 3D and 1D
diffusion coefficients, respectively $D_3$ and $D_1$
(experiments suggest $D_1 < D_3$ [13]), and, crucially, the
"sliding length", $l_s$. This is defined as the typical length
of DNA which the protein explores during one episode of 1D
diffusion. Via dimensional analysis, one can estimate a typical
time spent on a 3D excursion as $\tau_{3D} \sim V/(D_3 L)$,
while a typical sliding time is $\tau_{1D} \sim l_s^2/D_1$.
Furthermore, the mean number of 1D-3D search rounds is
$N_s \sim L/l_s$ [3]. One can combine these formulae to
estimate $\tau_s$ by summing the time spent performing 3D and
1D diffusion,

$$\tau_s = N_s(\tau_{1D} + \tau_{3D}) \sim A\frac{V}{D_3 l_s} + B\frac{L\, l_s}{D_1}, \qquad (1)$$

where $A$ and $B$ are geometry-dependent constants which
cannot be inferred from simple scaling H 0]. The most
important result from the theory is that there is an optimal
sliding length which minimises $\tau_s$, given by
$l_s^* = \sqrt{(A D_1 V)/(B D_3 L)}$. With typical parameters
for bacteria and assuming $A \sim B$ one finds that $l_s^*$ is
a few tens of nm.
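
As a rough numerical illustration of this scaling argument, the following sketch evaluates the estimate $\tau_s \sim N_s(A\,\tau_{3D} + B\,\tau_{1D})$ and the optimal sliding length. The function names are hypothetical, and the values of $D_1$, $D_3$, $A$ and $B$ are assumed purely for illustration (in reduced units); only $L$ and $V$ echo the values quoted for Fig. 1.

```python
import math


def mean_search_time(l_s, L, V, D1, D3, A=1.0, B=1.0):
    """Scaling estimate of the mean search time for sliding length l_s."""
    n_rounds = L / l_s        # N_s ~ L / l_s
    tau_3d = V / (D3 * L)     # typical duration of one 3D excursion
    tau_1d = l_s ** 2 / D1    # typical duration of one sliding episode
    return n_rounds * (A * tau_3d + B * tau_1d)


def optimal_sliding_length(L, V, D1, D3, A=1.0, B=1.0):
    """Sliding length that minimises the scaling estimate above."""
    return math.sqrt(A * D1 * V / (B * D3 * L))


# Assumed, illustrative parameters in reduced units (not fitted values).
L, V, D1, D3 = 500.0, 5.0e4, 0.1, 1.0
l_opt = optimal_sliding_length(L, V, D1, D3)
tau_min = mean_search_time(l_opt, L, V, D1, D3)
```

At the optimum the two contributions are equal, so the protein spends the same total time sliding as it does diffusing in 3D.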

While appealing, theoretical approaches building on Eq. (1)
commonly rely on several approximations in order to make
progress. Analytical models usually schematise DNA as a
structureless polymer (or assume that the polymer
configuration changes on a timescale much quicker than that
of the protein movement [IH ), and also neglect
intersegmental transfers, whereby the protein moves directly
(i.e. without a 3D excursion) between two DNA regions which
are close in 3D space, but can be far apart along the DNA
backbone. On the other hand, simulations 15- 1^1 usually
treat the DNA as frozen (an exception is the lattice study in
[3]), and disregard the base pair sequence of DNA.

Here we present a coarse-grained simulation of the search
process where we relax these two drastic approximations: we
include the dynamics of all components (DNA and proteins),
and we consider a heterogeneous DNA. We find that both
aspects are crucial players in determining how fast
facilitated diffusion is. First, we analyse the search process
on a string of rosettes, which better represents the
conformation of prokaryotic DNA as inferred from experiments
[20, 21]. We find that the relative position of the target
with respect to the network may change $\tau_s$ by orders of
magnitude. This giant effect cannot be captured by the theory
in Eq. (1), in which the target placement is immaterial. These
findings suggest that by changing the local DNA conformation
it should be possible to silence or express a given gene.
Second, if the DNA-protein interaction is sequence-dependent
[22], in general this slows down facilitated diffusion.
However, through a careful design of the DNA sequence, we
show that one can create a diffusional "funnel" that drives
the protein to its target much more quickly.

In this work we used Brownian dynamics (BD) simulations in
which we coarse grained DNA as a bead-and-spring polymer.
Each of the $N$ beads in the DNA had a diameter
$\sigma \sim 2.5$ nm, and neighbouring beads were connected by
finitely extensible nonlinear elastic (FENE) springs. Proteins
were modelled as spherical particles with a diameter of
$3\sigma$, with a spherical patch of radius $\sigma$ centred
Lie away from the protein centre (Fig. 1A). Only the latter
was sticky for the DNA, via a (truncated) Lennard-Jones (LJ)
interaction. All other coarse-grained

FIG. 1: (A) Snapshot of a DNA segment and a model protein.
(B) Mean search time $\tau_s$ for frozen and mobile DNA, as a
function of DNA-protein affinity, $\epsilon$. Parameters are
$L = 500\sigma$, $V \sim 50000\sigma^3$ (the fraction of the
volume occupied by the DNA is therefore $\sim 1\%$), while
data were averaged over 500 search runs. The lines are a fit
of the data with Eq. (1) [26]. (C) Plots of the 1D diffusion
coefficient, $D_1$, and the sliding length, as a function of
$\epsilon$. (D) Dependence of $\tau_s$ on affinity for various
DNA lengths, $L$ (in units of $\sigma$), with fixed
$V \sim 50000\sigma^3$.

beads interact via a purely repulsive potential which
captures steric effects (this is achieved by truncating a LJ
potential at a mutual distance of $2^{1/6}\sigma$). Finally,
three neighbouring beads along the DNA are subjected to an
additional force which models DNA semi-flexibility. Such a
force comes from the gradient of the Kratky-Porod potential
[23]; this can be expressed as $K\cos\theta$, where
$K = k_B T\, l_p/\sigma$ ($l_p = 20\sigma$ for DNA), and
$\theta$ is the angle between the three neighbouring beads.
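
The three contributions to the coarse-grained potential can be sketched as follows. This is a minimal illustration rather than the simulation code: the FENE constants `k` and `r0` are assumed (Kremer-Grest-like) values which are not given in the text, the repulsive LJ term is additionally shifted so that it vanishes at the cutoff (a common convention, assumed here), and $\theta$ is taken as the angle at the middle bead, so that a straight segment has $\theta = \pi$.

```python
import math

SIGMA = 1.0          # bead diameter (2.5 nm in physical units)
KBT = 1.0            # thermal energy, used as the energy unit
L_P = 20.0 * SIGMA   # DNA persistence length


def repulsive_lj(r, eps=1.0, sigma=SIGMA):
    """LJ potential truncated at 2^(1/6) sigma and shifted to zero there."""
    r_cut = 2.0 ** (1.0 / 6.0) * sigma
    if r >= r_cut:
        return 0.0
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6) + eps


def fene(r, k=30.0, r0=1.5):
    """FENE bond energy; k and r0 are assumed values, not from the text."""
    if r >= r0:
        raise ValueError("bond stretched beyond its maximum extension r0")
    return -0.5 * k * r0 ** 2 * math.log(1.0 - (r / r0) ** 2)


def bending(theta, kbt=KBT, l_p=L_P, sigma=SIGMA):
    """Kratky-Porod term K cos(theta), with K = k_B T l_p / sigma."""
    return (kbt * l_p / sigma) * math.cos(theta)
```

With this angle convention the bending energy is lowest for a straight triplet of beads and rises by $K = 20\,k_BT$ when the chain bends by a right angle.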

We will refer to the full potential, including the LJ,
Kratky-Porod and FENE terms, as $U$. If we denote the position
of the $i$-th sphere in the simulation as $\mathbf{x}_i$, its
evolution is determined by the following Langevin equation,

$$m_i \frac{d^2\mathbf{x}_i}{dt^2} = -\gamma_i \frac{d\mathbf{x}_i}{dt} - \nabla_i U + \sqrt{2 k_B T \gamma_i}\,\boldsymbol{\xi}_i(t),$$

where $\gamma_i$ is the friction felt by the particle,
$\nabla_i = \partial/\partial\mathbf{x}_i$, $k_B$ is the
Boltzmann constant, $T$ is the temperature, $m_i$ is the mass
of the $i$-th bead, and $\boldsymbol{\xi}_i(t)$ is an
uncorrelated Gaussian noise with zero mean and unit variance
[24]. All simulations were performed via the LAMMPS code [25].
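
To make the dynamics concrete, a single time step can be sketched in one dimension, assuming the standard underdamped Langevin form $m\ddot{x} = -\gamma\dot{x} - \partial U/\partial x + \sqrt{2 k_B T \gamma}\,\xi(t)$. The function `langevin_step` and its arguments are hypothetical names, and the simple Euler-Maruyama update below is chosen for readability; it is not the integration scheme used by LAMMPS.

```python
import math
import random


def langevin_step(x, v, force, m, gamma, kbt, dt, rng):
    """Advance position x and velocity v by one Euler-Maruyama step.

    force(x) returns -dU/dx; rng is a random.Random instance.
    """
    # Integrated noise over dt has variance 2 * kbt * gamma * dt.
    kick = math.sqrt(2.0 * kbt * gamma * dt) * rng.gauss(0.0, 1.0)
    v_new = v + (force(x) - gamma * v) * dt / m + kick / m
    x_new = x + v_new * dt
    return x_new, v_new


# Example: a bead in a harmonic well U = x^2 / 2 (assumed test potential).
rng = random.Random(0)
x, v = 0.0, 1.0
for _ in range(1000):
    x, v = langevin_step(x, v, lambda pos: -pos, m=1.0, gamma=1.0,
                         kbt=1.0, dt=0.01, rng=rng)
```

Setting `kbt=0` switches off the noise, in which case a free bead simply loses velocity by a factor $(1 - \gamma\,dt/m)$ per step.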

Fig. 1B shows the mean search time as a function of the
DNA-protein affinity $\epsilon$ (the depth of the attractive
LJ potential, measured in units of $k_BT$), for the cases in
which the DNA is either frozen (into a randomly chosen
equilibrium configuration) or mobile. Since the sliding length
increases with $\epsilon$ (Fig. 1C), our results are
consistent with Eq. (1) but now there is an optimal value
$\epsilon^*$ which minimises $\tau_s$. Unlike in the theory we
also observe a dependence of $D_1$ on $\epsilon$ (Fig. 1C),
which comes from the presence of energy barriers felt by the
protein while sliding - in our case these are mainly due to
the granularity of our polymer description, but they are
likely to be present for real DNA as well, due to the
modulations in the major and minor grooves, and the curvature
of the DNA. Experimentally $D_1$ has been shown to vary over a
large range of values for different conditions and DNA
sequences.

Intriguingly, freezing the DNA leads to a much slower search,
especially for large $\epsilon$. Our simulations also show
that increasing the amount of genome available in the search
volume, i.e. increasing $L$ at a fixed $V$, hinders, rather
than helps, facilitated diffusion, unless the affinity is very
small (Fig. 1D). While the latter effect can be readily
predicted from Eq. (1), understanding the difference between
the frozen and moving DNA cases requires a more detailed
analysis of the protein trajectories in our numerical
experiment. As one might expect, the 3D search time,
$\tau_{3D}$, which is dominant for small $\epsilon$, is larger
(a $\sim 40\%$ difference) for the frozen DNA; however we also
observe an almost 2-fold larger value of $\tau_{1D}$ for the
frozen case. Fig. 1C shows that while $l_s$ is similar for the
cases of mobile and frozen DNA, $D_1$ changes significantly,
i.e. it is smaller in the case of frozen DNA. We ascribe this
difference to the fact that, when mobile, the DNA is able to
adjust locally to the presence of the protein, and hence can
smooth out some of the energy barriers which slow down the 1D
sliding. Once the measured values of $l_s$, $D_1$, $A$ and $B$
[26] are put into Eq. (1), this actually provides a good fit
to our data, for both mobile and frozen DNA, as shown in
Fig. 1B. The small residual error may arise from the presence
of the previously mentioned "intersegmental transfers", which
are neglected by the theory - indeed their presence somewhat
changes the meaning of $l_s$. While traditionally $l_s$ is the
length over which the protein "slides" during each encounter
with the DNA, we here define it simply as the number of
distinct DNA beads visited during the encounter, whether
consecutive along the contour or separated due to
intersegmental transfers. Such events are present in our
simulations, and are more common in the mobile DNA case.
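
This operational definition of the sliding length is straightforward to apply to a trajectory. The sketch below assumes a hypothetical input format in which each frame records the index of the DNA bead to which the protein is bound, or `None` when the protein is free in solution; each uninterrupted run of bound frames is counted as one encounter.

```python
from typing import List, Optional


def sliding_lengths(bound_bead: List[Optional[int]]) -> List[int]:
    """Number of distinct DNA beads visited in each encounter.

    bound_bead[t] is the bead index the protein is bound to at frame t,
    or None if the protein is unbound at that frame.
    """
    lengths = []
    visited = set()
    for bead in bound_bead:
        if bead is None:
            if visited:              # an encounter has just ended
                lengths.append(len(visited))
                visited = set()
        else:
            visited.add(bead)        # revisits are not double counted
    if visited:                      # trajectory ended while still bound
        lengths.append(len(visited))
    return lengths


# Toy trajectory: the second encounter includes a jump from bead 10 to 42,
# as would be produced by an intersegmental transfer.
trajectory = [None, 3, 4, 4, 5, None, None, 10, 42, 10, None]
per_encounter = sliding_lengths(trajectory)
```

Averaging the returned values over many encounters gives an estimate of $l_s$ in units of beads.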

The DNA conformations found in vivo in bacteria, 
while not yet well characterised, are likely to be quite 
far from those of the self-avoiding polymer normally
considered in the theories for this process, and which we
studied in Fig. 1. Within the prokaryotic cytosol, DNA 
is known to be highly looped, due to the presence of 
DNA-binding architectural proteins such as condensins 
- this helps to achieve the compaction which is required 
to fit the whole genome within the narrow volume of a 
single cell. Therefore we consider in Fig. 2 the dynamics 
of a protein searching for its target on a DNA which is

FIG. 2: (A) Snapshot of a string of rosettes. (B) Mean search
time for a protein on a string of 3 rosettes with 5 loops
(each of length $20\sigma$) on a DNA of length
$L = 380\sigma$ with $V \sim 36000\sigma^3$ (1% DNA volume
fraction), for different positions of the target: (i) between
rosettes (bead 130), (ii) in the middle of a loop in a rosette
(bead 190), and (iii) in the centre of a rosette (bead 180),
as indicated in (A). The curve is the fit to the theory in
Eq. (1) [2^]. (Choosing a different number and size of loops
leads to qualitatively similar results.) (C) Time series of
the DNA bead ($s$) nearest to the protein at a given instant,
showing trapping close to a rosette centre
($\epsilon = 5.9\,k_BT$). Dashed lines separate beads
belonging to different rosettes.



made up of a string of rosettes, each of which consists 
of a series of loops joined together (see Fig. 2A). This 
idealised conformation gives a realistic local view of bac- 
terial DNA according to a number of biological models 
(see e.g. [21]) and is simple enough to be included in our
modelling. 

Fig. 2B shows the mean search time $\tau_s$ as a function of
$\epsilon$, for three different target positions: (i) between
rosettes, (ii) in the middle of a loop in a rosette, and (iii)
in the centre of a rosette. Our results show that when the
affinity between the protein and the DNA is small, so that 3D
diffusion dominates over 1D diffusion during the search, it
takes much longer to find a target in the centre of a rosette.
Such a target is more difficult to reach as the surrounding
loops are in the way. Interestingly, this trend reverses for
larger values of the affinity. To understand this, we observe
that in the large $\epsilon$ regime each of the rosettes acts
as a trap for the protein, i.e. it spends a large amount of
time in a rosette before moving to another one (see Fig. 2C).
Since sliding is the dominant transport mechanism, rather than
acting as a shield, the loops allow the protein to slide into
the centre of the rosette. Once there, intersegmental
transfers are more likely to keep the protein near that centre
than take it elsewhere. Such a mechanism then renders it
easier to find the target if it is close to one of the traps.

Fig. 2 therefore demonstrates that DNA topology and target
positioning, together with DNA-protein affinity, can be used
to control the relative ease with which different regions of
the genome can be accessed by proteins. We highlight that this
conclusion is outside the scope of most facilitated diffusion
theories based on arguments such as that in Eq. (1), in which
the position of the target does not feature. More
quantitatively, we have computed $\tau_{3D}$ and $\tau_{1D}$,
as well as $D_1$ and $l_s$, from our data, and found that
while $\tau_{3D} \sim V/(D_3 L)$ still holds, it is not
possible to fit $\tau_{1D}$ to the functional form
$B\,l_s^2/D_1$ throughout the $\epsilon$ range considered here
(not shown). This is because the rosette structure introduces
large correlations between the points where the protein leaves
and rejoins the DNA for each 3D excursion, meaning that $N_s$
is very sensitive to the target position and poorly predicted
by Eq. (1) (see Fig. 2B).

FIG. 3: (A) Plot of $\tau_s$ as a function of $\epsilon_{ns}$
for three DNA sequences: (i) homogeneous DNA, (ii) randomly
positioned traps and (iii) traps clustered so as to provide a
funnel towards the target. (B) Schematic showing the
DNA-protein affinity for each bead in a typical section of
DNA, for the two different trap arrangements. In the funnel
case traps alternate with non-specific beads. In each case the
target is at bead 500, $L = 1000\sigma$ and
$V \sim 10^5\sigma^3$ (1% DNA volume fraction).

We now turn to the discussion of another aspect found in real
DNA and commonly neglected in theoretical work: sequence
heterogeneity. The DNA sequence leads to a non-uniform free
energy landscape for a protein sliding along it. In order to
describe such a landscape, we allow the DNA-protein
interaction to vary from one DNA bead to another, with the
bead-dependent affinity set as prescribed by the model
proposed in Ref. [27]. There it was postulated that there
exist two possible states for a protein attached to the
genome: it can either bind in a non-specific mode - with
constant affinity $\epsilon_{ns}$, or in a sequence-dependent,
specific, mode - with affinity larger than $\epsilon_{ns}$.
The model in [27] assumes that the two states are in
equilibrium, so the protein will be found in whichever state
offers a stronger interaction. In our simulations, for DNA
bead $s$ we choose a specific interaction strength
$\epsilon_s(s)$ according to an appropriate distribution
[13, 27]; the affinity for that bead is then taken to be
whichever is the larger of $\epsilon_s(s)$ and
$\epsilon_{ns}$. In practice this leads to a free energy
profile with most beads favouring the nonspecific interaction
strength $\epsilon_{ns}$, with a small number of "traps" with
a greater interaction energy. Unlike those of the rosettes
considered in Fig. 2, which are determined by the 3D structure
of the DNA, such traps are encoded in the 1D sequence of
bases.
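
The rule for assigning bead affinities, and the contrast between randomly placed traps and a funnel-like arrangement around the target, can be illustrated with a toy example. Everything specific here is an assumption made for illustration: the function names, the numerical energies (in units of $k_BT$), the explicit `eps_target` argument, and in particular the choice to place the strongest traps closest to the target on every second bead.

```python
def bead_affinities(eps_specific, eps_ns):
    """Affinity per bead: the larger of the specific and non-specific values."""
    return [max(e, eps_ns) for e in eps_specific]


def funnel_affinities(affinities, eps_ns, target, eps_target):
    """Re-place the traps of a landscape on alternate beads around the target.

    Assumption (not from the text): stronger traps sit closer to the target.
    """
    traps = sorted((e for e in affinities if e > eps_ns), reverse=True)
    out = [eps_ns] * len(affinities)
    offset = 2  # every second bead stays non-specific
    for i in range(0, len(traps), 2):
        sites = (target - offset, target + offset)
        for pos, e in zip(sites, traps[i:i + 2]):
            if 0 <= pos < len(out):  # traps falling off the chain are dropped
                out[pos] = e
        offset += 2
    out[target] = eps_target
    return out


# Hypothetical specific energies for a 9-bead toy chain, target at bead 4.
eps_specific = [1.0, 5.0, 2.0, 7.0, 0.5, 6.0, 3.0, 1.0, 1.0]
random_traps = bead_affinities(eps_specific, eps_ns=3.0)
funnel = funnel_affinities(random_traps, eps_ns=3.0, target=4, eps_target=10.0)
```

Both landscapes contain the same set of trap strengths; only their positions along the chain differ.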

Fig. 3 shows the dependence of $\tau_s$ on $\epsilon_{ns}$ for
a DNA chain with $L = 1000\sigma$ (corresponding to
$\sim 7350$ base pairs), where the trap strength and number of
traps have been determined on the basis of the statistics for
the binding of a "typical" bacterial transcription factor (TF)
0, 0, Sj-. We focused on the case in which the target-protein
interaction energy is larger than the affinity with any of the
traps, which is the most common for real TFs [10]. We compared
the case of homogeneous DNA with a nonspecific interaction
$\epsilon_{ns}$, with two inhomogeneous sequences: (i) that in
which the position of the traps is random, leading to a
"golf-course" free energy landscape; and (ii) that in which
the DNA sites with enhanced affinity for the proteins are
clustered around the target (alternating non-specific and
enhanced binding beads) so as to provide a potential funnel
driving the protein to it (see Fig. 3B). We refer to these two
situations as the golf-course and funnel case respectively.

A general Kramers' argument suggests that the time the protein
spends in a trap may be estimated as
$\tau_{trap} = \tau_0\, e^{(\epsilon_{trap}-\epsilon_{ns})/k_BT}$,
where $\tau_0 \sim \sigma^2/D_1$ is the time it takes a
protein to move from one non-specifically interacting DNA bead
to the next. It is therefore not surprising that the
golf-course case leads to a far larger mean search time with
respect to the homogeneous DNA case, where the binding of the
protein to the genome is always "non-specific" (see Fig. 3A).
If the search involved 1D sliding along the DNA contour alone,
one might expect that if the non-specific interaction
$\epsilon_{ns}$ were increased at fixed $\epsilon_{trap}$ then
this would lead to an exponential decrease in the search time
(in line with the decrease in trap depth); however, for
facilitated diffusion, this is balanced by the increase in
$l_s$ (above its optimum value) which leads to a slower
search.
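
The exponential sensitivity of the trapping time to the trap depth is easy to quantify. The helper below is a hypothetical illustration of the Kramers estimate in reduced units ($\sigma = D_1 = k_BT = 1$ by default); the energies in the example are assumed values.

```python
import math


def trap_residence_time(eps_trap, eps_ns, sigma=1.0, d1=1.0, kbt=1.0):
    """Kramers estimate: tau_trap = tau_0 * exp((eps_trap - eps_ns) / kbt)."""
    tau_0 = sigma ** 2 / d1  # time to hop between neighbouring non-specific beads
    return tau_0 * math.exp((eps_trap - eps_ns) / kbt)


# Raising eps_ns from 3 to 5 k_B T at fixed eps_trap = 8 k_B T shortens
# the residence time in the trap by a factor exp(2).
slowdown_shallow_ns = trap_residence_time(8.0, 3.0)
slowdown_deep_ns = trap_residence_time(8.0, 5.0)
ratio = slowdown_shallow_ns / slowdown_deep_ns
```

A trap only $5\,k_BT$ deeper than the non-specific background already holds the protein roughly 150 times longer than an ordinary bead.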

The Kramers' argument does not apply to the funnel case, which
eliminates traps other than near the target - intersegmental
transfers from one "trap" to the next provide an alternative
transport mechanism which avoids slowdown due to the rugged 1D
potential. One may then expect that $\tau_s$ should be similar
to the one observed with uniform DNA, with some enhancement
due to the binding gradient which drives the protein towards
the target once it is in its close proximity. Strikingly, the
speed up with respect to the uniform case may instead reach
about one order of magnitude (and more than two with respect
to the golf-course case). This is probably because the
presence of the funnel can decrease the likelihood of the
protein being transported away from the vicinity of the
target, even for small $\epsilon_{ns}$ [29].

The dramatic difference between search efficiency in the
golf-course and funnel case is a consequence of the assumption
(from [27]) that proteins can bind to DNA either
non-specifically or specifically, and the two states are in
thermodynamic equilibrium so that the optimal binding for each
site can be selected quickly. It is currently not clear
whether this is a correct assumption - an alternative
suggestion [22|, |5(3] is that what matters may be the energy
barrier between the specific and nonspecific bound states,
rather than their absolute binding energy. If the energy
barrier between the states was very large for all sites except
the target, then our funnel sequence should not lead to much
enhancement in the efficiency with respect to the random case.
That is to say, the protein would see only a flat
(non-specific) landscape irrespective of the sequence, and the
"funnel" would not be accessible to it. It would therefore be
interesting to perform in vitro single molecule experiments
analogous to those of Ref. 0], where the DNA sequence is
either random or designed so as to create the funnel we
considered in Fig. 3. In this way one may directly test
whether the predictions from our simulations hold, and hence
determine which of the two theories mentioned above for
DNA-protein binding applies in reality.

In conclusion, we have presented Brownian dynamics 
simulations of the facilitated diffusion of a protein on 
DNA. Unlike previous numerical work, we have focused 
on the impact of 3D DNA conformation and sequence 
heterogeneity on the search dynamics. We have found 
that the presence of loops in the DNA may provide a 
way to tune the accessibility of a target on the genome, 
which cannot be accounted for by existing analytical the- 
ories. By considering a string of rosettes for the DNA 
conformations, we have seen that when the target is in 
the centre of a rosette and the DNA-protein affinity is 
small, the time needed to find it is larger than in the case 
when the target is positioned between rosettes. This ef- 
fect reverses for high affinity - in this regime each of the 
rosettes acts as a configurational trap, in the vicinity of 
which the protein lingers for a long time. While the con- 
formation of prokaryotic genomes may adopt far more 
complicated topologies than the string of rosettes which 
we have considered, our results are generic in predict- 
ing a dependence on the relative positioning of loops and 
targets. Hence we expect they should also apply to more 
disordered loop networks. Finally, we have considered 
the case of a heterogeneous DNA, where the affinity be- 
tween genome and protein is site-dependent, thereby in- 
troducing traps in the facilitated diffusion of the protein. 
When the sequence is random, these traps severely slow 
down the search process. However, when the sequence is 
designed so as to provide a funnel-like landscape around 
the target, the search may become much faster. Experi- 
ments to test this latter prediction should lead to a better 
understanding of the way proteins bind to DNA.